A geometry-based reward for video diffusion alignment, based on pointwise reprojection error, that enables both post-training (SFT/DPO) and training-free test-time scaling.
video diffusion
3d vision
geometric consistency
reward model
rlhf